Rhizobium leguminosarum bv. trifolii SRDI943 (strain syn. V2-2) is an aerobic, motile, Gram-negative,
non-spore-forming rod that was isolated from a root nodule of Trifolium michelianum Savi cv. Paradana
that had been grown in soil collected from a mixed pasture in Victoria, Australia. This isolate was found
to have a broad clover host range but was sub-optimal for nitrogen fixation with T. subterraneum (fixing
20-54% as much as the reference inoculant strain WSM1325) and was found to be totally ineffective with
the clover species T. polymorphum and T. pratense. Here we describe the features of R. leguminosarum bv.
trifolii strain SRDI943, together with genome sequence information and annotation. The 7,412,387 bp
high-quality-draft genome is arranged into 5 scaffolds of 5 contigs, contains 7,317 protein-coding genes
and 89 RNA-only encoding genes, and is one of 100 rhizobial genomes sequenced as part of the DOE Joint
Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB)
project.



Introduction 



The availability of usable nitrogen (N) is vital for productivity in agricultural systems that are
N-deficient [1]. It can be supplied exogenously in the form of industrially synthesized fertilizers.
However, this practice is expensive since fertilizer manufacture depends on the availability of fossil
fuels that are burnt to support the industrial process of chemical N-fixation. A far more economical
practice is to supply plant-available N to farming systems by exploiting the process of biological
N-fixation that occurs in a symbiotic relationship between legumes and their rhizobial microsymbionts
[2]. In this specific association, inert atmospheric dinitrogen gas is converted into bioavailable N to
support legume growth.



Pasture legumes, including the clovers that comprise the Trifolium genus, are major contributors of
biologically fixed nitrogen (N2) to mixed farming systems throughout the world [3,4]. In Australia,
soils with a history of growing Trifolium spp. have developed large and symbiotically diverse
populations of Rhizobium leguminosarum bv. trifolii (R. l. trifolii) that are able to infect and
nodulate a range of clover species. The N2-fixation capacity of the symbioses established by different
combinations of clover hosts (Trifolium spp.) and strains of R. l. trifolii can vary from 10 to 130%
when compared to an effective host-strain combination [5-8].

R. l. trifolii strain SRDI943 (syn. V2-2 [9]) was isolated from a nodule recovered from the roots of the
annual clover Trifolium michelianum Savi cv. Paradana that had been inoculated with soil collected from
under a mixed pasture at Walpeup, Victoria, Australia and grown in N deficient media for four weeks
after inoculation, in the greenhouse [10]. SRDI943 forms an effective symbiosis with T. purpureum but
sub-optimal N2-fixation symbiosis with T. subterraneum cv. Campeda and Clare (~24 and 54% respectively
of that with strain WSM1325 [9,11]). Here we present a preliminary description of the general features
for R. l. trifolii strain SRDI943 together with its genome sequence and annotation.

Classification and general features 

R. l. trifolii strain SRDI943 is a motile, Gram-negative rod (Figure 1 Left and Center) in the order
Rhizobiales of the class Alphaproteobacteria. It is fast growing, forming colonies within 3-4 days when
grown on half strength Lupin Agar (½LA) [12] at 28°C. Colonies on ½LA are white-opaque, slightly domed
and moderately mucoid with smooth margins (Figure 1 Right).

Minimum information about the Genome Sequence (MIGS) is provided in Table 1. Figure 2 shows the
phylogenetic relationship of R. l. trifolii strain SRDI943 to root nodule bacteria in the order
Rhizobiales in a 16S rRNA sequence based tree. This strain clusters closest to R. l. trifolii T24 and
Rhizobium leguminosarum bv. phaseoli RRE6 with 100% and 99.8% sequence identity, respectively.
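As a rough illustration of how a pairwise identity figure such as the 100% and 99.8% values above can be obtained, the following minimal sketch counts matching positions between two pre-aligned sequences. The function name `percent_identity`, the gap-skipping rule and the toy sequences are assumptions made for illustration; the source does not state how the reported identities were computed.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical positions between two equal-length aligned sequences.

    Columns where either sequence carries a gap ('-') are skipped, which is
    one common convention (an assumption here, not the authors' stated method).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gap-containing columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real 16S rRNA data).
toy_a = "ACGTACGTAC"
toy_b = "ACGTACGTAT"
print(f"{percent_identity(toy_a, toy_b):.1f}%")  # 9 of 10 positions match
```

If the identities had been computed over the 1,307 bp region used for the tree, 99.8% would correspond to roughly two or three differing positions; this is only an order-of-magnitude reading, not a reported result.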

Symbiotaxonomy

R. l. trifolii SRDI943 forms nodules on (Nod+) and fixes N2 (Fix+) with a range of annual and perennial
clover species of Mediterranean origin (Table 2). SRDI943 forms white, ineffective (Fix-) nodules with
the perennial clovers T. pratense and T. polymorphum.

Genome sequencing and annotation information

Genome project history 

This organism was selected for sequencing on the basis of its environmental and agricultural relevance
to issues in global carbon cycling, alternative energy production, and biogeochemical importance, and
is part of the Community Sequencing Program at the U.S. Department of Energy, Joint Genome Institute
(JGI) for projects of relevance to agency missions. The genome sequence is deposited in the Genomes
OnLine Database (GOLD) [33] and an improved-high-quality-draft genome sequence is available in
IMG/GEBA. Sequencing, finishing and annotation were performed by the JGI. A summary of the project
information is shown in Table 3.

Growth conditions and DNA isolation 

R. l. trifolii strain SRDI943 was cultured to mid logarithmic phase in 60 ml of TY rich media [34] on a
gyratory shaker at 28°C. DNA was isolated from the cells using a CTAB (cetyl trimethyl ammonium
bromide) bacterial genomic DNA isolation method [35].




Figure 1. Images of Rhizobium leguminosarum bv. trifolii strain SRDI943 using scanning (Left) and transmission (Center)
electron microscopy as well as light microscopy to show the colony morphology on solid media (Right).
Table 1. Classification and general features of Rhizobium leguminosarum bv. trifolii SRDI943 according to the MIGS recommendations [13]

| MIGS ID | Property | Term | Evidence code |
| | Current classification | Domain Bacteria | TAS [14] |
| | | Phylum Proteobacteria | TAS [15] |
| | | Class Alphaproteobacteria | TAS [16,17] |
| | | Order Rhizobiales | TAS [17,18] |
| | | Family Rhizobiaceae | TAS [19-21] |
| | | Genus Rhizobium | TAS [21-26] |
| | | Species Rhizobium leguminosarum bv. trifolii | TAS [21,23,27,28] |
| | Gram stain | Negative | IDA |
| | Cell shape | Rod | IDA |
| | Motility | Motile | IDA |
| | Sporulation | Non-sporulating | NAS |
| | Temperature range | Mesophile | NAS |
| | Optimum temperature | 28°C | NAS |
| | Salinity | Non-halophile | NAS |
| MIGS-22 | Oxygen requirement | Aerobic | TAS [11] |
| | Carbon source | Varied | NAS |
| | Energy source | Chemoorganotroph | NAS |
| MIGS-6 | Habitat | Soil, root nodule, on host | TAS [9] |
| MIGS-15 | Biotic relationship | Free living, symbiotic | TAS [9] |
| MIGS-14 | Pathogenicity | Non-pathogenic | NAS |
| | Biosafety level | 1 | TAS [29] |
| | Isolation | Root nodule | TAS [9] |
| MIGS-4 | Geographic location | Victoria, Australia | TAS [9] |
| MIGS-5 | Soil collection date | Dec, 1998 | IDA |
| MIGS-4.1 | Longitude | 142.0262 | |
| MIGS-4.2 | Latitude | -35.13531 | IDA |
| MIGS-4.3 | Depth | 0-10 cm | |
| MIGS-4.4 | Altitude | Not recorded | |

Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the 
literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based 
on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene 
Ontology project [30]. 
Figure 2. Phylogenetic tree showing the relationship of Rhizobium leguminosarum bv. trifolii SRDI943
(shown in blue print) with some of the root nodule bacteria in the order Rhizobiales based on aligned
sequences of the 16S rRNA gene (1,307 bp internal region). All sites were informative and there were no
gap-containing sites. Phylogenetic analyses were performed using MEGA, version 5.05 [31]. The tree was
built using the maximum likelihood method with the General Time Reversible model. Bootstrap analysis
[32] with 500 replicates was performed to assess the support of the clusters. Type strains are indicated
with a superscript T. Strains with a genome sequencing project registered in GOLD [33] are in bold print
and the GOLD ID is mentioned after the accession number. Published genomes are indicated with an
asterisk.
Table 2. Compatibility of SRDI943 with eleven Trifolium genotypes for nodulation (Nod) and N2-fixation (Fix)

| Species Name | Cultivar | Common Name | Growth Type | Nod | Fix | Reference |
| T. glanduliferum Boiss. | Prima | Gland | Annual | + | | |
| T. michelianum Savi. | Bolta | Balansa | Annual | + | + | |
| T. purpureum Loisel. | Paratta | Purple | Annual | + | + | [11] |
| T. resupinatum L. | Kyambro | Persian | Annual | + | + | |
| T. subterraneum L. | Campeda | Sub clover | Annual | | | [9,11] |
| T. subterraneum L. | Clare | Sub clover | Annual | + | + | [9,11] |
| T. vesiculosum Savi. | Arrotas | Arrowleaf | Annual | + | + | |
| T. fragiferum L. | Palestine | Strawberry | Perennial | + | + | |
| T. polymorphum Poir | Acc.#087102 | Polymorphous | Perennial | +(w) | - | [11] |
| T. pratense L. | | Red | Perennial | +(w) | - | |
| T. repens L. | Haifa | White | Perennial | + | + | |

(w) indicates nodules present were white. 



Table 3. Genome sequencing project information for Rhizobium leguminosarum bv. trifolii strain SRDI943.

| MIGS ID | Property | Term |
| MIGS-31 | Finishing quality | Improved high-quality draft |
| MIGS-28 | Libraries used | 2x Illumina libraries; Std short PE & CLIP long PE |
| MIGS-29 | Sequencing platforms | Illumina HiSeq2000 |
| MIGS-31.2 | Sequencing coverage | Illumina (761x) |
| MIGS-30 | Assemblers | Velvet 1.1.05, phrap SPS-4.24, Allpaths version 39750 |
| MIGS-32 | Gene calling methods | Prodigal 1.4, GenePRIMP |
| | GOLD ID | Gi08842 |
| | NCBI project ID | 89687 |
| | Database: IMG | 2517093000 |
| | Project relevance | Symbiotic N2 fixation, agriculture |
Genome sequencing and assembly

The genome of R. l. trifolii strain SRDI943 was sequenced at the Joint Genome Institute (JGI) using an
Illumina sequencing platform. An Illumina short-insert paired-end (PE) library with an average insert
size of 270 bp produced 18,764,470 reads and an Illumina CLIP long-insert paired-end (PE) library with
an average insert size of 9,482 bp produced 18,761,080 reads, totaling 5,629 Mb of Illumina data for
this genome. All general aspects of library construction and sequencing performed at the JGI can be
found at the DOE JGI user homepage [35]. The initial draft assembly contained 5 contigs in 5 scaffolds.
The initial draft data was assembled with Allpaths, version 39750. The Allpaths consensus was
computationally shredded into 10 kb overlapping fake reads (shreds). Illumina sequencing data were
assembled with Velvet, version 1.1.05 [36], and the consensus sequences were computationally shredded
into 1.5 kb overlapping fake reads (shreds). The Allpaths consensus shreds, the Illumina Velvet
consensus shreds and a sub-set of the Illumina CLIP paired-end reads were integrated using parallel
phrap, version SPS-4.24 (High Performance Software, LLC). The software Consed [37-39] was used in the
following finishing process. The estimated genome size is 7.4 Mb and the final assembly is based on
5,629 Mb of Illumina draft data, which provides an average of 761x coverage of the genome.
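The coverage figure quoted above follows from dividing the total sequence yield by the estimated genome size. The short sketch below reproduces that arithmetic from the numbers given in the text; the variable and function names are illustrative, and the mean read length it derives is an inference from the totals rather than a value stated in the source.

```python
# Values taken from the text above.
short_pe_reads = 18_764_470   # short-insert paired-end library
clip_pe_reads = 18_761_080    # CLIP long-insert paired-end library
total_yield_mb = 5_629        # total Illumina data, in Mb
estimated_genome_mb = 7.4     # estimated genome size, in Mb


def average_coverage(yield_mb: float, genome_mb: float) -> float:
    """Average depth of coverage = sequenced bases / genome size."""
    return yield_mb / genome_mb


total_reads = short_pe_reads + clip_pe_reads
coverage = average_coverage(total_yield_mb, estimated_genome_mb)

# Implied mean read length (derived here, not reported in the source).
mean_read_length_bp = total_yield_mb * 1_000_000 / total_reads

print(f"total reads:      {total_reads:,}")
print(f"average coverage: {coverage:.0f}x")
print(f"mean read length: {mean_read_length_bp:.0f} bp")
```

Under these inputs the calculation agrees with the 761x coverage reported for the assembly.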



Genome annotation 

Genes were identified using Prodigal [40] as part of the DOE-JGI annotation pipeline [41], followed by a
round of manual curation using the JGI GenePRIMP pipeline [42]. The predicted CDSs were translated and
used to search the National Center for Biotechnology Information (NCBI) non-redundant database,
UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. These data sources were combined to
ascribe a product description for each predicted protein. Non-coding genes and miscellaneous features
were predicted using tRNAscan-SE [43], RNAMMer [44], Rfam [45], TMHMM [46], and SignalP [47].
Additional gene prediction analyses and functional annotation were performed within the Integrated
Microbial Genomes (IMG-ER) platform [35,48].

Genome properties 

The genome is 7,412,387 nucleotides long with 60.69% GC content (Table 4) and comprises 5 scaffolds
(Figure 3) of 5 contigs. Of a total of 7,406 genes, 7,317 were protein-encoding and 89 were RNA-only
encoding genes. The majority of genes (78.5%) were assigned a putative function whilst the remaining
genes were annotated as hypothetical. The distribution of genes into COGs functional categories is
presented in Table 5.
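The gene counts and percentages reported here can be cross-checked with a few lines of arithmetic. The sketch below uses the values from the text and Table 4; the helper name `percent_of` is hypothetical and the check is offered only as a consistency test of the published figures.

```python
# Counts from the text and Table 4.
total_genes = 7_406
protein_coding_genes = 7_317
rna_genes = 89
genes_with_function = 5_814
genome_size_bp = 7_412_387
coding_region_bp = 6_395_342


def percent_of(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Protein-coding and RNA genes should account for every predicted gene.
assert protein_coding_genes + rna_genes == total_genes

print("protein-coding:", percent_of(protein_coding_genes, total_genes), "%")
print("RNA genes:     ", percent_of(rna_genes, total_genes), "%")
print("with function: ", percent_of(genes_with_function, total_genes), "%")
print("coding density:", percent_of(coding_region_bp, genome_size_bp), "%")
```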



Table 4. Genome Statistics for Rhizobium leguminosarum bv. trifolii SRDI943

| Attribute | Value | % of Total |
| Genome size (bp) | 7,412,387 | 100.00 |
| DNA coding region (bp) | 6,395,342 | 86.28 |
| DNA G+C content (bp) | 4,498,817 | 60.69 |
| Number of scaffolds | 5 | |
| Number of contigs | 5 | |
| Total genes | 7,406 | 100.00 |
| RNA genes | 89 | 1.20 |
| rRNA operons | 3 | |
| Protein-coding genes | 7,317 | 98.80 |
| Genes with function prediction | 5,814 | 78.50 |
| Genes assigned to COGs | 5,770 | 77.91 |
| Genes assigned Pfam domains | 6,032 | 81.45 |
| Genes with signal peptides | 631 | 8.52 |
| Genes with transmembrane proteins | 1,618 | 21.85 |
| CRISPR repeats | 0 | |
Figure 3. Graphical map of the genome of Rhizobium leguminosarum bv. trifolii strain SRDI943. From
bottom to the top of each scaffold: Genes on forward strand (color by COG categories as denoted by the
IMG platform), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, sRNAs red,
other RNAs black), GC content, GC skew.
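The GC content and GC skew tracks in a map of this kind are usually computed over sliding windows along each scaffold. The following is a minimal sketch of that calculation using the conventional definitions, GC content = (G + C) / (A + C + G + T) and GC skew = (G - C) / (G + C). The function names, window parameters and toy sequence are assumptions for illustration and do not describe the IMG platform's actual implementation.

```python
from typing import Iterator, Tuple


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    total = g + c + seq.count("A") + seq.count("T")
    return (g + c) / total if total else 0.0


def gc_skew(seq: str) -> float:
    """GC skew, (G - C) / (G + C); 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def sliding_windows(seq: str, window: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for each full-length window."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


# Toy sequence (hypothetical); real tracks would use scaffold sequences
# and much larger windows.
toy = "GGGCATATGCCCATGG"
for start, win in sliding_windows(toy, window=8, step=4):
    print(start, f"GC={gc_content(win):.2f}", f"skew={gc_skew(win):+.2f}")
```

Applied genome-wide, the same GC content definition gives the 60.69% reported in Table 4 (4,498,817 G+C bases out of 7,412,387).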
Table 5. Number of protein coding genes of Rhizobium leguminosarum bv. trifolii SRDI943
associated with the general COG functional categories.

| Code | Value | %age | COG Category |
| J | 196 | 3.03 | Translation, ribosomal structure and biogenesis |
| A | 1 | 0.02 | RNA processing and modification |
| K | 652 | 10.06 | Transcription |
| L | 231 | 3.57 | Replication, recombination and repair |
| B | 2 | 0.03 | Chromatin structure and dynamics |
| D | 40 | 0.62 | Cell cycle control, mitosis and meiosis |
| Y | 0 | 0.00 | Nuclear structure |
| V | 76 | 1.17 | Defense mechanisms |
| T | 373 | 5.76 | Signal transduction mechanisms |
| M | 334 | 5.16 | Cell wall/membrane biogenesis |
| N | 92 | 1.42 | Cell motility |
| Z | 1 | 0.02 | Cytoskeleton |
| W | 1 | 0.02 | Extracellular structures |
| U | 95 | 1.47 | Intracellular trafficking and secretion |
| O | 193 | 2.98 | Posttranslational modification, protein turnover, chaperones |
| C | 324 | 5.00 | Energy production and conversion |
| G | 714 | 11.02 | Carbohydrate transport and metabolism |
| E | 659 | 10.17 | Amino acid transport and metabolism |
| F | 109 | 1.68 | Nucleotide transport and metabolism |
| H | 192 | 2.96 | Coenzyme transport and metabolism |
| I | 227 | 3.50 | Lipid transport and metabolism |
| P | 333 | 5.14 | Inorganic ion transport and metabolism |
| Q | 165 | 2.55 | Secondary metabolite biosynthesis, transport and catabolism |
| R | 842 | 13.00 | General function prediction only |
| S | 627 | 9.68 | Function unknown |
| - | 1,636 | 22.09 | Not in COGs |
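The percentages in Table 5 appear to use two different denominators: the per-category values are consistent with the sum of all category assignments, whereas the "Not in COGs" value is consistent with the total gene count from Table 4. The sketch below checks this reading against the tabulated counts. It is an inference from the numbers, not a statement from the source, and the variable names are illustrative.

```python
# Per-category gene counts from Table 5.
cog_counts = {
    "J": 196, "A": 1, "K": 652, "L": 231, "B": 2, "D": 40, "Y": 0,
    "V": 76, "T": 373, "M": 334, "N": 92, "Z": 1, "W": 1, "U": 95,
    "O": 193, "C": 324, "G": 714, "E": 659, "F": 109, "H": 192,
    "I": 227, "P": 333, "Q": 165, "R": 842, "S": 627,
}
not_in_cogs = 1_636
total_genes = 7_406           # from Table 4
genes_assigned_to_cogs = 5_770  # from Table 4

# Sum of category assignments; this can exceed the number of genes assigned
# to COGs because one gene may fall into more than one category.
total_assignments = sum(cog_counts.values())

category_pct = {
    code: round(100.0 * n / total_assignments, 2)
    for code, n in cog_counts.items()
}
not_in_cogs_pct = round(100.0 * not_in_cogs / total_genes, 2)

print("total category assignments:", total_assignments)
print("J:", category_pct["J"], " R:", category_pct["R"])
print("Not in COGs:", not_in_cogs_pct)
```

The counts are also internally consistent in that genes assigned to COGs (5,770) plus genes not in COGs (1,636) equals the total gene count (7,406).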